Cascaded Subpatch Networks for Effective CNNs

نویسندگان

  • Xiaoheng Jiang
  • Yanwei Pang
  • Manli Sun
  • Xuelong Li
چکیده

Conventional convolutional neural networks use either a linear or a nonlinear filter to extract features from an image patch (region) of spatial size Hx W (typically, H is small and is equal to W, e.g., H is 5 or 7 ). Generally, the size of the filter is equal to the size Hx W of the input patch. We argue that the representational ability of equal-size strategy is not strong enough. To overcome the drawback, we propose to use subpatch filter whose spatial size hx w is smaller than Hx W . The proposed subpatch filter consists of two subsequent filters. The first one is a linear filter of spatial size hx w and is aimed at extracting features from spatial domain. The second one is of spatial size 1x 1 and is used for strengthening the connection between different input feature channels and for reducing the number of parameters. The subpatch filter convolves with the input patch and the resulting network is called a subpatch network. Taking the output of one subpatch network as input, we further repeat constructing subpatch networks until the output contains only one neuron in spatial domain. These subpatch networks form a new network called the cascaded subpatch network (CSNet). The feature layer generated by CSNet is called the csconv layer. For the whole input image, we construct a deep neural network by stacking a sequence of csconv layers. Experimental results on five benchmark data sets demonstrate the effectiveness and compactness of the proposed CSNet. For example, our CSNet reaches a test error of 5.68% on the CIFAR10 data set without model averaging. To the best of our knowledge, this is the best result ever obtained on the CIFAR10 data set.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Cascaded Convolutional Neural Network for Single Image Dehazing

Images captured under outdoor scenes usually suffer from low contrast and limited visibility due to suspended atmospheric particles, which directly affects the quality of photos. Despite numerous image dehazing methods have been proposed, effective hazy image restoration remains a challenging problem. Existing learning-based methods usually predict the medium transmission by Convolutional Neura...

متن کامل

Convolutional Gating Network for Object Tracking

Object tracking through multiple cameras is a popular research topic in security and surveillance systems especially when human objects are the target. However, occlusion is one of the challenging problems for the tracking process. This paper proposes a multiple-camera-based cooperative tracking method to overcome the occlusion problem.  The paper presents a new model for combining convolutiona...

متن کامل

Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study

Despite considerable enhances in recognizing hand gestures from still images, there are still many challenges in the classification of hand gestures in videos. The latter comes with more challenges, including higher computational complexity and arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...

متن کامل

Cystoscopy Image Classication Using Deep Convolutional Neural Networks

In the past three decades, the use of smart methods in medical diagnostic systems has attractedthe attention of many researchers. However, no smart activity has been provided in the eld ofmedical image processing for diagnosis of bladder cancer through cystoscopy images despite the highprevalence in the world. In this paper, two well-known convolutional neural networks (CNNs) ...

متن کامل

Deep Recurrent Neural Networks for Human Activity Recognition

Adopting deep learning methods for human activity recognition has been effective in extracting discriminative features from raw input sequences acquired from body-worn sensors. Although human movements are encoded in a sequence of successive samples in time, typical machine learning methods perform recognition tasks without exploiting the temporal correlations between input data samples. Convol...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • IEEE transactions on neural networks and learning systems

دوره   شماره 

صفحات  -

تاریخ انتشار 2017